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Abstract 

This paper deals with a source separation strategy based on second- 
order statistics, namely, on data covariance matrices estimated at several 
lags. In general, "blind" approaches to source separation do not assume 
any knowledge on the mixing operator; however, any prior information 
about the possible structure of the mixing operator can improve the so- 
lution. Unlike ICA blind separation approaches, where mutual indepen- 
dence between the sources is assumed, our method only needs to constrain 
second-order statistics, and is effective even if the original sources are sig- 
nificantly correlated. Besides the mixing matrix, our strategy is also capa- 
ble to evaluate the source covariance functions at several lags. Moreover, 
once the mixing parameters have been identified, a simple deconvolution 
can be used to estimate the probability density functions of the source 
processes. To benchmark our algorithm, we used a database that sim- 
ulates the one expected from the instruments that will operate onboard 
ESA's Planck Surveyor Satellite to measure the CMB anisotropies all over 
the celestial sphere. The assumption was made that the emission spectra 
of the galactic foregrounds can be parametrised, thus reducing the num- 
ber of unknowns for system identification to the number of the foreground 
radiations. We performed separation in several sky patches, featuring dif- 
ferent levels of galactic contamination to the CMB, and assuming several 
noise levels, including the ones derived from the Planck specifications. 

Keywords: methods: statistical - techniques: image processing - cosmic 
microwave background. 

1 Introduction 

Separating the individual radiations from the measured signals is a common 
problem in astrophysical data analysis .27 ■ As an example, in cosmic microwave 
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background anisotropy surveys, the cosmological signal is normally combined 
with foreground radiations from both extragalactic and galactic sources, such 
as the Sunyaev-Zeldovich effects from clusters of galaxies, the effect of the in- 
dividual galaxies, the emission from galactic dust, the galactic synchrotron and 
free-free emissions. If one is only interested in estimating the CMB anisotropics, 
the interfering signals can just be treated as noise, and reduced by suitable can- 
cellation procedures. However, the foregrounds have an interest of their own, 
and it could be useful to extract all of them from multichannel data, by exploit- 
ing their different emission spectra. 

Some authors mUE] have tried to extract a number of individual radia- 
tion data from measurements on different frequency channels, assuming that 
the physical mixture model is perfectly known. Unfortunately, such an assump- 
tion is rather unrealistic and could overconstrain the problem, thus leading to 
unphysical solutions. Attempts have been made to avoid this shortcoming by 
introducing criteria to evaluate a posteriori the closeness to reality of the mix- 
ture model and allowing individual sources to be split into separate templates 
to take spatial parameter variability into account |21|[S]. 

A class of techniques capable of estimating the source signals as well as identi- 
fying the mixture model has recently been proposed in astrophysics P]|23 | |3||16 | . 
In digital signal processing, these techniques are referred to as blind source sep- 
aration (BSS) and rely on statistical assumptions on the source signals. In 
particular, mutual independence and nongaussianity of the source processes are 
often required [2U]. This totally blind approach, denoted as independent compo- 
nent analysis (ICA), has already given promising results, proving to be a valid 
alternative to assuming a known data model. On the other hand, most ICA 
algorithms do not permit to introduce prior information. Since all available 
information should always be used, semi-blind techniques are being studied to 
make astrophysical source separation more flexible with respect to the specific 
knowledge often available in this type of problem [22]. Moreover, the inde- 
pendence assumption is not always justified; if there is evidence of correlation 
between pairs of sources, it should be made possible to take this information 
into account, thus abandoning the strict ICA approach. 

The first blind technique proposed to solve the separation problem in as- 
trophysics j3j was based on ICA, and allowed simultaneous model identification 
and signal estimation to be performed. The independence requirement was ful- 
filled by taking the statistics of all orders into account, as in all ICA methods 
presented in the literature (see for example |i;-i | [2T) ] 'l. 

The problem of estimating all the model parameters and source signals can- 
not be solved by just using second-order statistics, since these are only able to 
enforce uncorrelation. However, this has been done in special cases, where addi- 
tional hypotheses on the spatial correlations or, equivalcntly, on the spectra of 
the individual signals are assumed HH1E3I [101 • As will be clear in the following, 
within the framework of any noisy linear mixture model, the data covariancc 
matrix at a particular lag is related to the source covariance matrix at the same 
lag, the mixing matrix, and the noise covariance matrix. If there is a sufficient 
number of lags for which the source covariance matrices are not null, then it 
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is possible to identify the model parameters by estimating the data covariancc 
matrices from the observed data. Indeed, assuming to know the noise covariance 
matrix, we are able to write a number of relationships from which the unknown 
parameters can be estimated. This is what is done by the second-order blind 
identification (SOBI) algorithm presented in ^Uj. SOBI, however, relies on joint 
digonalization of covariance matrices at different lags, which is only applicable 
in the case of uncorrelated source signals. In our approach, we assumed that the 
mixing matrix can be parametrised. This allows us to relax the independence 
assumption, and to pursue identification by optimization of a suitable function. 
A further advantage of this strategy is that the relevant correlation coefficients 
between pairs of sources can also be estimated. In the particular case of sepa- 
rating astrophysical foregrounds from cosmic microwave background, moreover, 
the relevant constraints are such that the total number of parameters to be esti- 
mated can substantially be reduced. This permits to improve the performance 
of our technique. We will show that, even assuming full covariance matrices at 
different lags, a very fast model learning algorithm can be devised, matching the 
theoretical covariance matrices to the ones estimated from the observed data. 

The paper is organised as follows. In Section |2J we formalise the problem 
and introduce the relevant notation. In Section [21 we describe how the mixing 
matrix can be parametrised in our case. In Sections 0] and \5\ we describe the 
methods we used to learn the mixing model and to estimate the original sources, 
respectively. In Section we present some experimental results, with both 
stationary and nonstationary noise. In the final section, we give some remarks 
and future directions. 

2 Problem statement 

As usual PIP], we assume that each radiation process s c (£,?7, v) from the 
microwave sky has a spatial pattern s c (£, rj) that is independent of its frequency 
spectrum F c {u): 

a D %rt,v) = s c {t,rj)F c {v) (1) 

Here, £ and rj are angular coordinates on the celestial sphere, and v is frequency. 
The total radiation observed in a certain direction at a certain frequency is 
given by the sum of a number N of signals (processes, or components) of the 
type (JTJ, where subscript c has the meaning of a process index. Assuming 
that the effects of the telescope beam on the angular resolution at different 
measurement channels have been equalised (see EH]); the observed signal at M 
different frequencies can be modelled as 

x(£,t?) =As(£,77)+n(£,77) (2) 

where x={xd, d = M} is the M-vector of the observations, d being a 

channel index, A is an M x N mixing matrix, s = {s c , c = 1, . . . , N} is the 
A-vector of the individual source processes and n—{nd, d — 1, . . . ,M} is the 
M-vector of instrumental noise. The elements of A are related to the source 
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spectra and to the frequency responses through the following formula: 



where bd{v) is the instrumental frequency response in the d-th measurement 
channel, which is normally known very well. If we assume that the source 
spectra are constant within the passbands of the different channels, equation 
© can be rewritten as 



The element adc is thus proportional to the spectrum of the c-th source at the 
center-frequency of the d-th channel. The separation problem consists in 
estimating the source vector s from the observed vector x. Several estimation 
algorithms have been derived assuming a perfect knowledge of the mixing ma- 
trix. As already said, however, this matrix is related to both the instrumental 
frequency responses, which are known, and the emission spectra F c (i>), which 
are normally unknown. For this reason, relying on an assumed mutual indepen- 
dence of the source processes s c (£,??), some blind separation algorithms have 
been proposed [22 123 , which are able to estimate both the mixing matrix 
and the source vector. Assuming that the source signals are mutually indepen- 
dent, the MN mixing coefficients can be estimated by finding a linear mixture 
that, when applied to the data vector, nullifies the cross-cumulants of all orders. 
If, however, some prior information allows us to reduce the number of unknowns, 
the identification problem can be solved by only using second-order statistics. 
This is the case with our approach, which is based on a parametrisation of ma- 
trix A . This approach, described in Section 0] does not need a strict mutual 
independence assumption. Logically any blind separation algorithm is divided 
into two phases: using the notation introduced here, the estimation of A will 
be referred to as system identification (or model learning), and the estimation 
of s will be referred to as source separation. In this paper, we first address 
aspects related to learning, and then give some details on source separation 
strategies derived from standard reconstruction procedures. Before describing 
our algorithm in detail, we recall here some applicability issues. 
Source and noise processes. To estimate the covariance matrices from the 
available data, the source and the noise processes must necessarily be assumed 
stationary. While CMB satisfies this assumption, the foregrounds are not sta- 
tionary all over the celestial sphere. This assumption can be made for small 
sky patches. However, depending on the particular sky scanning strategy, noise 
is normally nonstationary, even within small patches, and can also be auto- 
correlated. The noise covariance function should be known for any shift and 
for any angular coordinate in the celestial sphere. Provided that the noise 
nonstationarity and cross-correlation between sources can be neglected, various 
methods are available, both in space and frequency domain, to estimate sam- 
ples of the noise covariance function or, equivalently, of noise spectrum |16| . 
Tackling the space- variant nature of the noise process is difficult, and no simple 
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method has been proposed so far to this purpose. In [22] the noise variance at 
each pixel is assumed to be known and a method is proposed to estimate the 
mixing matrix and the probability density function of each component. In the 
present approach, we found experimentally that, if a noise covariance map is 
known, even nonstationary noise can be treated. 

Frequency dependent telescope beams. The model assumed in is valid 
if the telescope radiation patterns are the same in all the frequency channels. As 
the beams are frequency dependent, a way to tackle the problem is to preprocess 
the observed data in order to equalise the resolution on all the measurement 
channels, as in 26 . This also changes the autocorrelation function of each noise 
process, but in a way that can be exactly evaluated. A different way to tackle the 
problem has been to approach it in the frequency domain 19 16j. Also in these 
cases, the validity of the solution relies on a number of simplifiying assumptions, 
such as the perfect circular symmetry of the telescope beams. Moreover, the 
actual capability of extrapolating the spectrum at spatial frequencies where 
reduced information is available has still to be assessed, especially in the cases 
where the signal-to-noise ratio is particularly low. 

Structure of the source covariance matrices. In the Planck experiment, 
the sources of interest are the CMB signal and the foregrounds. While no cor- 
relation is expected between the CMB signal and foregrounds, some statistical 
dependence between pairs of foregrounds has to be taken into account. The off- 
diagonal entries of the source covariance matrices related to pairs of correlated 
sources will thus be nonzero, whereas all the remaining off-diagonal elements 
will be zero. When it is known that some of the cross-covariances are close to 
zero, these can be kept fixed at zero, thus further reducing the total number of 
unknowns. For instance, in a 3 x 3 case, if we assume the following structure 
for the source covariance matrix at zero-shift: 



this means that we assume zero or negligible correlations between sources 1 and 
2 and sources 1 and 3, and the remaining cross-covariance er 2 3 — 032 between 
sources 2 and 3 is an unknown of the problem, along with the autocovariances 
an. Note that, for the typical scaling ambiguity of the blind identification 
problem, the absolute values of both the diagonal and off-diagonal elements of 
matrices C s (r, if)) have no physical significance, while, by calculating ratios of 
the type 



we can actually estimate the correlation coefficients between different sources, 
whatever the values of the individual covariances. 
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3 Parametrisation of the mixing matrix 



While in a general source separation problem the elements a^c are totally un- 
known, in our case we have some knowledge about them. In fact, the integral in 
10} is related to known instrumental features and to the emission spectra of the 
single source processes, on which we do have some knowledge. As an example, 
if the observations are made in the microwave and millimeter-wave range, the 
dominant radiations are the cosmic microwave background, the galactic dust, 
the free-free emission and the synchrotron (see ^Tj). Another significant signal 
comes from the extragalactic radio sources. Here we assume that the latter has 
been removed from the data by one of the specific techniques proposed in the 
literature [2HIC31 E23- As a matter of fact, these techniques cannot remove to- 
tally the extragalactic point sources, but they remove the brightest ones (which 
are the most important, since they significantly affect the the study of the CMB, 
see [HO]). As far as the other signals are concerned, the emission spectrum of the 
cosmic microwave background is perfectly known, being a blackbody radiation. 
In terms of antenna temperature, it is: 

[exp(z/) - 1] 

where v is the frequency in GHz divided by 56.8. From Q and Q, the column 
of A related to the CMB radiation is thus known up to an unessential scale 
factor. For the synchrotron radiation, we have 

F syn (y) (x v~ n ' (8) 

Thus, the column of A related to synchrotron only depends on a scale factor 
and the spectral index n s . For the thermal galactic dust, we have 

Fdust{v) oc — (9) 

exp(^) — 1 

where v = hv /kTdust, h is the Planck constant, k is the Boltzmann constant 
and Tdust is the physical dust temperature. If we assume a uniform temperature 
value, the frequency law ©, that is, the column of A related to dust emission, 
only depends on a scale factor and the parameter m. 

The above properties enable us to describe the mixing matrix by means of 
just a few parameters. As an example, if we assume to have a perfectly known 
source spectrum (such as the one of CMB) and N — 1 sources with one-parameter 
spectra, the number of unknowns in the identification problem is N — 1 instead 
of NM. 

For the sake of simplicity, although other foregrounds (such as SZ and free- 
free) could be taken into account, in our experiments we only considered syn- 
chrotron and dust emissions, which are the most significant in the Planck fre- 
quency range. 
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4 A second-order identification algorithm 



Let us consider the source and noise signals in © as realisations of two station- 
ary vector random processes. The covariance matrices of these processes are, 
respectively, 

C.(r,V) = ([s(e,r ? )- Ats ][s(e + r,r / + V)-^] T ) , (10) 
C„(r,V0 - ([n(e,r ? )- Ain ][n(e + r,7 7 + V)-Mn] T } (11) 

where (.) denotes expectation under the appropriate joint probability, fi s and fi n 
are the mean vectors of processes s and n, respectively, and the superscript T 
means transposition. As usual, the noise process is assumed signal-independent, 
white and zero-mean, with known variances. Thus, for both r and tf> equal to 
zero, C n is a known diagonal matrix whose elements are the noise variances in 
all the measurement channels, whereas for any r or tjj different from zero C ra is 
the null M x M matrix. 

As already proved [H] , covariance matrices, i.e. second-order statistics, 
permit blind separation to be achieved when the sources show a spatial structure, 
namely, when they are spatially correlated. Thus, the mutual independence 
requirement of ICA can be replaced by an equivalent requirement on the spatial 
structure of the signal, and the identifiability of the system is assured. In other 
words, finding matrices A and C s is generally not possible from covariances at 
zero shift alone; to identify the mixing operator, either higher-order statistics 
or the covariance matrices at several nonzero shift pairs (r, ip) must be taken 
into account. Of course, this is also a requirement on the sources, since if the 
covariance matrices are null for any pair (r,ip), identification is not possible. 
This aspect will become clearer below. 

Let us now see our approach to system identification. By exploiting equation 
J2J), the covariance of the observed data can be written as: 

C x (t,V)= ([x(£,77)-/^][x(£ + T,7/ + ^)-/i :c ] T } = 

= AC s (r,^)A T + C„(r,V) • (12) 

Since C x (T,tp) can be estimated from 

C x (r, 4>) = ±- [*& V) ~ Mx] [x(£ + r, r, + </>) - Mxf , (13) 

where N p is the number of pixels. Equation i|12[l provides a number of in- 
dependent nonlinear relationships that can be used to estimate both A and 
C s . Obviously, this possibility does not rely on mutual independence between 
the source signals, as required by the ICA approach: the only requirement is 
having a sufficient number of nonzero covariance matrices. In other words, spa- 
tial structure can be used in the place of mutual independece as a basis for 
model learning and signal separation. As assumed in the previous section, in 
this particular application the number of unknowns is reduced by parametris- 
ing the mixing matrix. This allows us to solve the identification problem from 
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the relationships made available by Equation H12|) by only using the zero-shift 
covariance matrix, even if some of the sources are cross-correlated. We investi- 
gated this possibility in 0. In a general case, matrices A and C s (r, -0) can be 
estimated from 

(r, S(r, ip)) = arg min|| A(r)c s (E(r^))A T (r)-c x (r,vO-c„(r^)||. (14) 

The minimisation is performed over vectors T and E, and for all available values 
of the pairs (t, ip), where T is the vector of all the parameters defining A (pos- 
sibly consisting in all the matrix elements), and S(t, ip) is the vector containing 
all the unknown elements of matrices C s for every shift pair. The matrix norm 
adopted is the Frobenius norm. Our present strategy to find the minimiser in 
l|14l) is to alternate a componentwise minimisation in T with fixed C s , and the 
evaluation of C s , whose elements for each (t, ip) can be calculated exactly once 
A is fixed. A more accurate minimisation strategy is now being studied. 

From the above scheme, it is clear that for each independent element of 
the matrices C x (r, ip) we have an independent equation for the estimation of 
vector r and of all the vextors S(t, ip). Since for (r, ip) = (0,0) matrix C x is 
symmetric, for zero shift we have M(M + l)/2 independent equations. For any 
other shift pair, C x is a general matrix and thus, provided that it is not zero, 
we have M 2 additional independent equations. If N s is the total number of 
nonzero shift pairs generating nonzero data covariance matrices, we thus have 
a total number of M(M + l)/2 + N s - M 2 = M[(2N S + l)M + l]/2 independent 
equations . The number of unknowns is at most NAI + N(N + 1 ) / 2 + N s - N 2 , in 
the case where all the elements of A are unknown and all the source covariance 
matrices are full, that is, all the sources at any shift are correlated to each other. 
Note that, in this worst case situation, if it is M ~ N, we always have TV 2 more 
unknowns than equations, independently of N s . As soon as we have M > N, 
there is always a number of nonzero shift pairs for which we have more inde- 
pendent equations than unknowns to be estimated. This observation gives an 
idea of the amount of information we have available for our estimation problem. 
The number of independent equations affects the behaviour of the nonlinear 
optimization landscape in 114|) . Qualitatively, we can affirm that the more in- 
dependent equations we have, the more well-posed the optimization problem 
will be. In particular, it is likely that in absence of any prior information about 
the structure of A and C s (t, ip) having a number of observed channels equal to 
the number of sources always leads to insufficient information, independently 
of the number of shift pairs chosen. If, instead, the number of the available 
observations is larger than the number of sources, the possibility of estimating 
the unknowns relies on the number of shift pairs for which the data covariance 
matrices are nonzero. The availability of prior information, as in the application 
considered here, can of course alleviate these requirements. For example, if we 
have a 4 x 4 mixing matrix only depending on four parameters and only two 
sources significantly correlated, the unknowns to be determined are 4 + 5 + iV s -6, 
by using a maximum of M(M + l)/2 + N s ■ M 2 equations. This means that in 
this case, as soon as M — 4, the number of independent equations is larger than 
the number of unknowns even for N s = 0. 
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5 Signal separation strategy 



Model learning is only the first step in solving the problem of source separation. 
Although, in principle, one could simply use multichannel inverse filtering to 
recover the source maps, this approach is not feasible in practice, for the presence 
of noise. In our treatment, the data are assumed to be an ergodic process, in 
order to be able to evaluate its statistics from the available sample. This entails 
a space invariant noise process. The estimation of the individual source maps 
should be made on the basis of all the products of the learning stage. In our case, 
we have estimates of the mixing matrix and of the source covariance matrices at 
several shift pairs. In the hypothesis of stationary noise, we could exploit this 
information to implement a multichannel Wiener filter for source reconstruction. 
If the noise is not stationary, a generalized Kalman filter should be used. Our 
point here is on model learning, and thus we do not address the separation issues 
in detail. We only observe that a possible Bayesian separation scheme would 
make use of the source probability densities, and these can be estimated from 
our mixing matrix. Indeed, let us assume that our learning procedure has given 
a good estimate of A. Let B be its Moore-Penrose generalised inverse. In our 
case we have M > N, thus, as is known, 

B = (A T A)~ 1 A T . (15) 

From J2J) we have 

Bx=s+Bn (16) 

Let us denote each of the N rows of B as an M- vector b^, i = 1, . . . , N, and 
consider the generic element yi of the A-vector Bx, 

yi := hj • x = Si + bj n := s; + n u (17) 

The probability density function of g/j, p{yi), can be estimated from and the 
data record x(£, rj), while the probability density function of n ti , p(n ti ), is a 
Gaussian, whose parameters can be easily derived from C„ and hi. The pdf of 
yi is the convolution between p{si) and p(ntj: 

PiVi) =P(si)*p(n ti ), (18) 

From this relationship, p(si) can be estimated by deconvolution. As is well 
known, deconvolution is normally an ill-posed problem and, as such, it lacks 
a stable solution. In our case, we can regularise it by enforcing smoothness, 
positivity, and the normalisation condition for pdfs. 

Any Bayesian estimation approach should exploit the knowledge of the source 
densities to regularise the solution, but these are normally unknown. In the case 
examined here, the source distributions can be efficiently estimated as summa- 
rized above, and the computational cost of otherwise expensive Bayesian algo- 
rithms can be reduced. As an example, in [22], the source densities are modelled 
as mixtures of Gaussians, and the related parameters are estimated by an in- 
dependent factor analysis approach (see |H][2])- ^he method we propose here 
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Figure 1: Source maps from a 15° x 15° patch centered at 0° galactic latitude 
and 40° galactic longitude, at 100 GHz: a) CMB; b) synchrotron; c) thermal 
dust. 

could well be used to fix the source densities, thus reducing the overall cost of 
the identification-separation task. 

From Equation l|16l) . it can be seen that the generalised inverse solution is 
already an estimate of the sources, since it is composed of the original source 
vectors corrupted by amplified noise. Thus, a simple source estimation strategy 
could be first to apply Equation l(Po|) and then suitably filter the result, to 
reduce the influence of noise. In next section, we show some experimental results 
obtained by pseudoinversion of the estimated mixing matrix, followed by Wiener 
filtering of each individual source. This strategy would be strictly valid with 
stationary noise and high signal-to-noise-ratio, however, interesting results have 
been found even with strong nonstationary noise. Multichannel Wiener filtering 
for stationary noise and an extended Kalman filter for the nonstationary case 
are now being developed. 

6 Experimental results 

In this section, we present some results from our extensive experimentation 
with the method described above. Our data were drawn from a data set that 
somulates the one expected from the Planck surveyor satellite (see the Planck 
homepage 1 ). The source maps we considered were the CMB anisotropy, the 
galactic synchrotron and thermal dust emissions over the four measurement 
channels centred at 30 GHz, 44 GHz, 70 GHz and 100 GHz. The test data 
maps have been generated by extracting several sky patches at different galactic 
coordinates from the simulated database, scaling them exactly according to 
formulas 0-©, generating the mixtures for the channels chosen, and adding 
realisations of Gaussian, signal independent, white noise. Several noise levels 
have been used, from a ten percent to more than one hundred percent of the 
CMB standard deviation. The range chosen contains noise levels within the 
Planck specifications. Although our method would be only suited for uniform 

1 http: / /astro. estec.esa.nl/SA-general/Projects/Planck/ 
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Figure 2: Noisy data maps at a) 100 GHz; b) 70 GHz; c) 44 GHz; d) 30 GHz. 



noise, we also tried to apply it to data corrupted by nonuniform noise, and 
obtained promising results. 

Within this section, we will divide the results obtained in model learning 
from the results in separation, and the cases with stationary noise from those 
with nonstationary noise. In these latter cases, knowledge of a noise variance 
map is assumed, and the additional problem arises of choosing the appropriate 
noise covariance matrix. 

The results from learning are the mixing matrix and the source covariance 
matrices at the shift pairs chosen. From the estimate of the mixing matrix, it is 
also possible to derive the marginal source densities, by using relationships l|17fl 
and (|18|l . We have seen that the results under this aspect are more sensitive 
than others to noise, and the choice of the regularization parameters is quite 
critical. 

Our separation results are all derived from the application of the Moore- 
Penrose pseudoinverse of the estimated mixing matrix, followed by a classical 
Wiener filtering on each output image. From this processing, estimates of the 
source maps are obtained. Also, estimated source power spectra can be obtained 
from either the maps or the source autocorrelation matrices. In particular, the 
results we show here are derived from the unfiltered pseudoinverse solutions, 
showing that, although the reconstructed images are heavily affected by noise, 
the derived power spectra can be corrected for the theoretical noise spectrum 
and thus estimated quite accurately. 

The results presented here will all be related to a single data record, derived 
from a simulated 15° x 15° sky patch centered at 40° galactic longitude and 0° 
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Figure 3: Norm of the residual in eq. (|14f) as a function of the iteration number. 

galactic latitude. It is to be noted that in such a patch, located on the galactic 
plane, the measured data will be affected by strong foreground interference, thus 
making the problem very difficult to solve. Indeed, many separation approaches 
experimented so far simply fail in proximity of the galactic plane, and they are 
normally applied after masking the all-sky data in the high-interference regions. 
It is to remark that our method failed with sky patches taken at high galactic 
latitudes, where the only dominant signal is the CMB, and the foregrounds are 
often well below the noise level. Some other techniques, such as ICA (see [3]), 
did obtain good results even in these regions, but the noise levels introduced 
in those cases were much lower than the ones we have used in this work. In 
these regions, furthermore, CMB is almost the only measured radiation at the 
considered frequencies, and is estimated very well with all the assigned signal- 
to-noise ratios. At lower galactic latitudes, conversely, the situation is rather 
different. Here, the dust emission is stronger than CMB, and separation is 
strictly necessary if CMB is to be distinguished from the foregrounds. Our 
method performed very well with these data, and all the relevant parameters 
were satisfactorily estimated even with the strongest noise components. The 
noise standard deviation we adopted in the case shown here is 30% the standard 
deviation of CMB at 100 GHz. The noise level in the other channels has been 
simply obtained by scaling the level at 100 GHz in accordance to the expected 
Planck sensitivity at those frequencies. For each patch considered, we tried 
different noise levels, up to more than 100% of the CMB level at 100 GHz, and 
for each noise level, we performed a Monte Carlo simulation with hundreds of 
different noise realizations. The results of this analysis are not shown here, but 
we can say that no significant bias has been found in the results. 

In Figure^ we show the three source maps we used in the situation described 
above. In this figure and in all the others shown here, the grayscale is linear with 
black corresponding to the maximum image value. We assigned the sources Si to 
CMB, S2 to synchrotron and s 3 to dust, and the signals X\, X2, £3 and X4 to the 
measurement channels at 100, 70, 44 and 30 GHz, respectively. Therefore, the 
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Figure 4: Real (dotted) and estimated (solid) source density functions for a) 
CMB; b) synchrotron; c) dust. 



first, second, and third columns of the mixing matrix will be related to CMB, 
synchrotron and dust, respectively, and the first, second, third, and fourth rows 
of the mixing matrix will be related to the 100 GHz, 70 GHz, 44 GHz, and 
30 GHz channels, respectively. The mixing matrix, A D , has been derived from 
equations 0-©, with spectral indices n s = 2.9 and m = 1.8 (see for example 
and [IB]): 

1 1 \ 

2.8133 0.5485 

10.8140 0.2464 

32.8359 0.1260 



V 



1 

1.1353 
1.2241 
1.2570 



(19) 



/ 



In figure|21 we show the data maps for stationary noise. Also, note that the 
case examined does not fit the ICA assumptions. For example, the normalized 
source covariance matrix at zero shift is: 



C s (0,0) 



1.0000 
0.1961 
0.0985 



0.1961 0.0985 
1.0000 0.6495 
0.6495 1.0000 



(20) 



where a significant correlation, of the order of 65%, can be observed between 
the dust and synchrotron maps. 

For the data described above, we ran our learning algorithm for 500 different 
noise realisations; for each run, 10000 iterations of the minimisation procedure 
described in the previous section were performed. The unknown parameters 
were the spectral indices n s and m, and all the elements of matrices C s (r, ip). 
The cost defined in (|14|) , as a function of the iteration number in a particular run, 
is shown in figure |3 The typical elapsed times per run were a few minutes on 
a 2 GHz CPU computer, with a Matlab interpreted code. In the case described 
here, we estimated n s = 2.8985 and m = 1.7957, corresponding to the mixing 
matrix 

/ 1 1 1 \ 

1.1353 2.8118 0.5494 
1.2241 10.8009 0.2473 
\ 1.2570 32.7775 0.1267 / 



(21) 
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As a quality index for our estimation, we adopted the matrix Q=(A T C~ 1 A) _1 
(A T C~ 1 A ), which, in the ideal case, should be the N x N identity matrix I. 
In the present case, we have: 

1.0000 -0.0074 -0.0013 
Q = ( 0.0000 1.0020 0.0000 | . (22) 
0.0000 0.0054 1.0013 

The Frobenius norm of matrix Q— I should be zero in the case of perfect model 
learning. In this case, it is 0.0096. 

These results have been found by considering 25 uniformly distributed shift 
pairs, with < r < 20 and < ip < 20. As a synthetic index for the quality 
of the reconstructed source covariance matrices, we adopted a matrix E, where 
each element is the relative error in the same covariance element, averaged over 
all the pairs (r, tp) : 

1 ^ |C siji (r,^)-C si ,(r,^)| 



where N s is the total number of shift pairs and C s are the estimated source 
covariance matrices. Of course, matrix Ij23(l is only defined when all the denom- 
inators are nonzero. A more accurate analysis of the results can be made from 
the element-by-element comparison of the estimated and the original matrices, 
but we do not report these results here. For the case shown above, we have: 

/ 0.0274 0.0392 0.0496 \ 
E = 0.0472 0.0170 0.0120 . (24) 
\ 0.0917 0.0125 0.0050 / 

The reconstructed probability density functions of the source processes, es- 
timated from equations (|17[) and (|18|) . are shown in figure 0| 

We separated the sources by multiplying the data matrix by the Moore- 
Penrose generalised inverse, as in i|16fl . and then by applying a Wiener filter to 
the results thus obtained. As already said, this is not the best choice reconstruc- 
tion algorithm at all, especially when the data are particularly noisy. However, 
the results we obtained are visually very good, as shown in figure To evalu- 
ate more quantitatively the results of the whole learning-separation procedure, 
we compared the power spectrum of the CMB map with the one of the recon- 
structed map. This comparison is shown in Figure El where we also show the 
possibility of correcting the reconstructed spectrum for the known theoretical 
spectrum of the noise component n tl , obtained as in (jl7JI . As can be seen, 
the reconstructed spectrum is very similar to the original within a multipole 
I = 2000. 

Strictly speaking, our algorithm could not be applied to nonstationary pro- 
cesses. However, let us assume that the original sources are stationary, and the 
noise is nonstationary but still spatially white and uncorrelated. This means 
that its covariance matrix, let us call it R. n (£,77), depends on the pixel. From 
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Figure 6: a) Real (dotted) and estimated (solid) CMB power spectra. The 
dashed line represents the theoretical power spectrum of the noise component 
n tl in l|17fl . evaluated from the noise covariance and the Moore-Penrose pseu- 
doinverse of the estimated mixing matrix, b) Real (dotted) and estimated (solid) 
CMB power spectrum, corrected for theoretical noise. 



15 




Figure 7: Map of noise standard deviations used to generate nonstationary data 




».i h( 4 

Figure 8: Wiener- filtered estimated maps from nonstationary data: a) CMB; b) 
synchrotron; c) dust. 
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our assumptions, these matrices are zero for any nonzero pair (r,ip). We tried 
our method on nonstationary data, by assuming to know R n (£,77), and using a 
covariance matrix given by 

C n (0,0) = -^^R n (^7 ? ) (25) 

The nonstationary data were obtained from a spatial template of noise standard 
deviations expected for typical Planck observations, shown in Figure [7| The 
actual standard deviations were adjusted so as to obtain the average signal-to- 
noise ratios desired for the different channels. The separation results for a case 
where these SNRs were the same as in the above stationary case are shown in 
FigureEl where the degradation in the reconstruction is apparent in the regions 
where the noise is stronger. The results, in terms of recontructed power spectra, 
are perfectly comparable to the ones exemplified in Figure El The estimated 
spectral indices were n s — 2.8885 and m = 1.7881, corresponding to the mixing 
matrix 

I \ 1 1 \ 

1.1353 2.8018 0.5509 , , 

1.2241 10.7128 0.2488 ' ^ ' 

\ 1.2570 32.3861 0.1279 / 

The average error on covariance matrices is in this case: 

/ 0.0158 0.1165 0.1930 \ 
E = 0.1163 0.0331 0.0254 . (27) 
\ 0.2440 0.0261 0.0144 / 

The Frobenius norm of matrix Q— I is now 0.0736, that is, slightly worse than 
for the above stationary case. 



7 Concluding remarks 

By exploiting the spatial structure of the sources, we developed an identifica- 
tion and separation algorithm that is able to exploit any available information 
on possible structure of the mixing matrix and the source covariance matrices. 
This can include the fully blind approach and the case exemplified here, where 
the mixing matrix is known to only depend on two parameters. The identifi- 
cation task is performed by a simple optimization strategy, while the proper 
separation can be faced by different approaches. We experimented the simplest 
one, but we are also developing more accurate techniques, especially suited to 
treat nonstationary noise on the data. 

Our method is suitable to work directly with all-sky maps, but it could be 
necessary to apply it to small patches, as is shown in the above experimental 
section, to cope with the expected variability of the spectral indices and the 
noise variances in different sky regions. 
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It has been observed that it does not make sense to try source separation 
in those regions where the foreground emissions are much smaller than CMB 
and well below the noise level. In any case, the CMB angular power spectrum 
has always been estimated fairly well up to a multipole I = 2000, irrespective 
of the galactic latitude. The estimation of the source densities has also given 
good results. Source separation by our method has been particularly interest- 
ing with data from low galactic latitudes, where the foreground variance is often 
higher than the one of the CMB signal. Note that many separation strategies, 
both blind and non-blind, have failed their goal in this region of the celestial 
sphere. As an example, WMAP data analysis (see JT]) was often performed 
by using pixel intensity masks that exclude the brightest sky portion from be- 
ing considered. Another interesting feature of our method is that significant 
cross-correlations between pairs of foregrounds can be straighforwardly taken 
into account. Recently, some methods for a completely blind separation of cor- 
related sources have been proposed in the literature (see for example 0). Their 
effectiveness in astrophysical map separation has not been proved yet. More- 
over, they have a high computational complexity. 

Recently |16j . a frequency domain implementation of the method in |10| has 
been proposed. This method allows to take antenna beam effects into account 
straightforwardly by including the effect of the antenna transfer functions in 
the model. It is also very flexible in introducing prior information about the 
entries of the mixing matrix and the spatial power spectra of the components. 
An open problem is the extension of these methods to the case of correlated 
sources. A possible extended method might be implemented in the space or in 
the frequency domain according to convenience. Another problem that is still 
open with the expected Planck data is the different resolution of the data maps 
in some of the measurement channels. The identification part of our method 
can work with maps whose resolution has been degraded in order to be the 
same in all the channels. The result should be an estimate of the mixing matrix, 
which can be used in any non-blind separation approach with channel-dependent 
resolution, such as maximum entropy |19|. However, the possible asymmetry 
of the telescope beam patterns should be taken into account in verifying this 
possibility. 
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